
Notes and Examples: The Linear Model

Simple Linear Regression In R

To test the effect of one variable on another, simple linear regression may be applied. The fitted model may be expressed as $y=\alpha + \hat{\beta} x$, where $\alpha$ is a constant, $\hat{\beta}$ is the estimated coefficient, and $x$ is the explanatory variable.

Fig. 40: Example taken from R of a fitted model using linear regression.

Details

Below is the linear regression output for R's built-in data set cars. Notice that the output from the model may be divided into two main categories:

  1. output that assesses the model as a whole, and

  2. output that relates to the estimated coefficients for the model.
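
This output is produced by fitting the model with lm() and printing its summary, as in the following snippet:

    # fit the simple linear regression of stopping distance on speed
    fit <- lm(dist ~ speed, data = cars)
    summary(fit)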

    Call:
    lm(formula = dist ~ speed, data = cars)

    Residuals:
        Min      1Q  Median      3Q     Max
    -29.069  -9.525  -2.272   9.215  43.201

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -17.5791     6.7584  -2.601   0.0123 *
    speed         3.9324     0.4155   9.464 1.49e-12 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 15.38 on 48 degrees of freedom
    Multiple R-squared:  0.6511,    Adjusted R-squared:  0.6438
    F-statistic: 89.57 on 1 and 48 DF,  p-value: 1.490e-12

Notice that there are four different sets of output (Call, Residuals, Coefficients, and the overall results), with estimates reported both for the constant $\alpha$ (the intercept) and for the estimated coefficient $\hat{\beta}$ on the speed variable.

The estimated coefficients describe the change in the dependent variable associated with a one-unit increase in the explanatory variable, holding everything else constant.
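
For the fit above, the estimated coefficients can be extracted with coef(); the slope means that each additional unit of speed is associated with roughly 3.93 more units of stopping distance:

    # extract the estimated coefficients (values rounded as in the output)
    coef(fit)
    # (Intercept)       speed
    #    -17.5791      3.9324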

The standard error is a measure of accuracy and is used to construct the confidence interval. Confidence intervals provide a range of values for which there is a set level of confidence that the true population parameter lies within the given range. For example, if the confidence level is set at 95%, then 95% of intervals constructed in this way will contain the true value.
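
In R the confidence intervals for the coefficients of the fit above come from confint(); each interval is the estimate plus or minus a t quantile times the standard error:

    # 95% confidence intervals for the coefficients
    confint(fit, level = 0.95)
    # e.g. for speed: 3.9324 +/- qt(0.975, df = 48) * 0.4155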

The $p$-value is a probability, often expressed as a percentage. Specifically, the $p$-value indicates the percentage of the time, given that your null hypothesis is true, that you would find an outcome at least as extreme as the observed value. If your calculated $p$-value is 0.02, then 2% of the time you would observe a test statistic at least as extreme as the one you observed.
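
As a check, the two-sided $p$-value for the speed coefficient can be recomputed from its $t$ value and the 48 degrees of freedom shown in the output:

    # two-sided p-value from the reported t value
    2 * pt(-abs(9.464), df = 48)   # approximately 1.49e-12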

In the overall model assessment, the $R^2$ is the explained variance over the total variance. Generally, a higher $R^2$ is better, but $R^2$ never decreases as more explanatory variables are added, which is why the adjusted $R^2$, which penalizes for the number of parameters, is also presented.
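
Both quantities can be read off the summary object of the fit above:

    # R-squared and adjusted R-squared
    s <- summary(fit)
    s$r.squared       # 0.6511
    s$adj.r.squared   # 0.6438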

Lastly, the $F$-statistic is given. Since the $t$-statistic is not appropriate for testing two or more coefficients jointly, the $F$-statistic must be applied. The basic methodology is to compare a restricted model, in which the coefficients have been set to certain fixed values, to the unrestricted model. The most common form is the sum-of-squared-residuals $F$-test.
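
The overall $F$-statistic in the output is exactly such a comparison, between the intercept-only (restricted) model and the full (unrestricted) model:

    # restricted model: the speed coefficient is fixed at zero
    fit0 <- lm(dist ~ 1, data = cars)
    # sum-of-squared-residuals F-test of restricted vs. unrestricted
    anova(fit0, fit)   # reproduces F = 89.57 on 1 and 48 DF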

Multiple Linear Regression

The One-way Model

The one-way ANOVA model is of the form:

$$Y_{ij}=\mu_i+\epsilon_{ij}$$

or

$$Y_{ij}=\mu+\alpha_i+\epsilon_{ij}$$

Details

The one-way ANOVA model is of the form:

$$Y_{ij}=\mu_i+\epsilon_{ij}$$

where $Y_{ij}$ is observation $j$ in treatment group $i$ and the $\mu_i$ are the parameters of the model, the means of the treatment groups. The $\epsilon_{ij}$ are independent and follow a normal distribution with mean zero and constant variance $\sigma^2$, often written as $\epsilon\sim N(0,\sigma^2)$.

The ANOVA model can also be written in the form:

$$Y_{ij}=\mu+\alpha_i+\epsilon_{ij}$$

where $\mu$ is the overall mean of all treatment groups and $\alpha_i$ is the deviation of the mean of treatment group $i$ from the overall mean. The $\epsilon_{ij}$ follow a normal distribution as before.

The expected value of $Y_{ij}$ is $\mu_i$, as the expected value of the errors is zero, often written as $E[Y_{ij}]=\mu_i$.

Examples

Example

In the rat diet experiment the model would be of the form:

$$y_{ij}=\mu_i+\epsilon_{ij}$$

where $y_{ij}$ is the weight gain for rat $j$ in diet group $i$, $\mu_i$ would be the mean weight gain in diet group $i$, and $\epsilon_{ij}$ would be the deviation of rat $j$ from the mean of its diet group.
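
In R this model can be fit as a linear model with a factor covariate. The sketch below assumes a hypothetical data frame ratdiet with columns gain and diet (names not given in the notes):

    # one-way ANOVA for the rat diet experiment (hypothetical data frame)
    fit_diet <- lm(gain ~ diet, data = ratdiet)
    anova(fit_diet)     # test for differences between diet group means
    # cell-means form y_ij = mu_i + eps_ij: drop the intercept
    coef(lm(gain ~ diet - 1, data = ratdiet))   # estimated group means mu_i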

Random Effects In the One-way Layout

The simplest random effects model is the one-way layout, commonly written in the form

$$y_{ij}=\mu + \alpha_i + \epsilon_{ij},$$

where $j =1,\ldots,J$ and $i =1,\ldots,I$.

Normally one also assumes $\epsilon_{ij}\sim N(0,\sigma^2)$, $\alpha_i \sim N(0,\sigma_A^2)$, and that all these random variables are independent.

Note that we have stopped making a distinction in notation between random variables and measurements (the $y$-values are treated as random variables whenever distributions are involved).

Details

Note that this is considerably different from the fixed effect model.

Since the factor has changed to a random variable with an expected value of zero, the expected value of all the $y_{ij}$ is the same:

$$Ey_{ij}=\mu$$

The variance of yy now has two components:

$$Vy_{ij}=\sigma^2_A + \sigma^2$$

In addition there is a covariance structure between the measurements, which needs to be looked at in some detail. First, consider the general case of the covariance between two measurements $y_{ij}$ and $y_{i'j'}$, where the indices may or may not be the same:

$$\begin{aligned} Cov(y_{ij},y_{i'j'}) &= Cov(\alpha_i+\epsilon_{ij},\ \alpha_{i'}+ \epsilon_{i'j'}) \\ &= E[(\alpha_i+\epsilon_{ij})(\alpha_{i'}+\epsilon_{i'j'})] \\ &= E[\alpha_i\alpha_{i'}] + E[\epsilon_{ij}\alpha_{i'}]+ E[\alpha_i\epsilon_{i'j'}] + E[\epsilon_{ij}\epsilon_{i'j'}] \end{aligned}$$
Note

Recall that $E[UW]=E[U]E[W]$ if $U,W$ are independent.

So

$$E[\epsilon_{ij}\alpha_{i'}]=E\epsilon_{ij}\,E\alpha_{i'}=0 \quad\text{and}\quad E[\alpha_i\epsilon_{i'j'}] = E\alpha_i\,E\epsilon_{i'j'}=0$$

Further,

$$E[\epsilon_{ij}\epsilon_{i'j'}] = \begin{cases} \sigma^2 & \text{if } i=i', j=j' \\ 0 & \text{otherwise} \end{cases}$$

and

$$E[\alpha_{i}\alpha_{i'}] = \begin{cases} \sigma^2_A &\text{if } i=i' \\ 0 &\text{if } i \neq i' \end{cases}$$

so

$$Cov(y_{ij},y_{i'j'}) = \begin{cases} \sigma_A^2+\sigma^2 & \text{if } i=i', j=j' \\ \sigma_A^2 & \text{if } i=i', j \neq j' \\ 0 & \text{otherwise} \end{cases}$$

It follows that the correlation between measurements $y_{ij}$ and $y_{ij'}$ (within the same group) is

$$\begin{aligned} Cor(y_{ij},y_{ij'}) &= \frac{Cov(y_{ij},y_{ij'})}{\sqrt{Var[y_{ij}]Var[y_{ij'}]}} \\ &= \frac{\sigma_A^2}{\sqrt{(\sigma_A^2 + \sigma^2)^2}} \\ &= \frac{\sigma_A^2}{\sigma_A^2 + \sigma^2} \end{aligned}$$

This is the intra-class correlation.
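
A small simulation makes the variance components and the intra-class correlation concrete. The sketch below uses the lme4 package, which the notes do not prescribe, to recover $\sigma_A^2$ and $\sigma^2$ from simulated data:

    # simulate a one-way random effects layout with sigma_A = 2 and sigma = 1,
    # so the true intra-class correlation is 4 / (4 + 1) = 0.8
    library(lme4)
    set.seed(1)
    I <- 20; J <- 10
    g <- factor(rep(1:I, each = J))         # group labels
    a <- rnorm(I, mean = 0, sd = 2)         # random group effects alpha_i
    y <- 5 + a[g] + rnorm(I * J, 0, 1)      # y_ij = mu + alpha_i + eps_ij
    fit_re <- lmer(y ~ 1 + (1 | g))         # random intercept model
    print(VarCorr(fit_re))                  # estimated sigma_A and sigma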

Linear Mixed Effects Models (lmm)

The simplest mixed effects model is

$$y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$$

where $\mu, \alpha_1, \alpha_2, \ldots, \alpha_I$ are unknown constants,

$$\beta_j \sim N(0,\sigma^2_\beta)$$

$$\epsilon_{ij} \sim N(0,\sigma^2)$$

($\beta_j$ and $\epsilon_{ij}$ independent).

Details

The $\mu$ and $\alpha_i$ are the fixed effects and the $\beta_j$ are the random effects. Recall that in the simple one-way layout with $y_{ij} = \mu + \alpha_i + \epsilon_{ij}$, we can write the model in matrix form $\underline{y} = X \underline{\beta} + \underline{\epsilon}$, where $\underline{\beta} = (\mu, \alpha_1, \ldots, \alpha_I)'$ and $X$ is appropriately chosen.

The same applies to the simplest random effects model $y_{ij}= \mu + \beta_j+ \epsilon_{ij}$, where we can write $\underline{y} = \mu \cdot \underline{1}+ Z \underline{U} + \underline{\epsilon}$ with $\underline{1}=(1,1,\ldots,1)'$ and $\underline{U} = (\beta_1, \ldots, \beta_J)'$.

In general, we write mixed effects models in matrix form as $\underline{y} = X \underline{\beta} + Z \underline{U} + \underline{\epsilon}$, where $\underline{\beta}$ contains the fixed effects and $\underline{U}$ contains the random effects.
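
For a fitted model, the matrices $X$ and $Z$ can be inspected directly. The sketch below again assumes lme4 and uses a small made-up balanced layout:

    # build a small balanced layout and extract the design matrices
    library(lme4)
    set.seed(2)
    d <- expand.grid(i = factor(1:3), j = factor(1:4))
    d$y <- rnorm(nrow(d))
    fit_mm <- lmer(y ~ i + (1 | j), data = d)   # alpha_i fixed, beta_j random
    getME(fit_mm, "X")   # fixed effects design matrix X
    getME(fit_mm, "Z")   # random effects design matrix Z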

Examples

Example
  1. $y_i = \beta_1 + \beta_2 x_i + \epsilon_i$ (SLR)

  2. $y_{ij} = \mu + \alpha_i + \beta_i x_{ij} + \epsilon_{ij}$, only fixed effects (ANCOVA)

  3. $y_{ijk} = \mu + \alpha_i + b_j + \epsilon_{ijk}$, where $\alpha_i$ are fixed but $b_j$ are random.

  4. $y_{ijk} = \mu + \alpha_i + b_j x_{ij} + \epsilon_{ijk}$, where $\alpha_i$ are fixed but $b_j$ are random slopes.
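
In R these four models correspond roughly to the following formulas, where y, x, a, and j are hypothetical column names and lme4 is assumed for the mixed models:

    # 1. SLR:                    lm(y ~ x)
    # 2. ANCOVA (fixed only):    lm(y ~ a * x)             # a is a factor
    # 3. random intercepts b_j:  lmer(y ~ a + (1 | j))
    # 4. random slopes b_j:      lmer(y ~ a + (0 + x | j))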

Maximum Likelihood Estimation In LMM

The likelihood function for the unknown parameters $L(\boldsymbol{\beta},\sigma^2_A, \sigma^2)$ is

$$\frac{1}{(2\pi)^{n/2} \left| \boldsymbol{\Sigma}_y \right|^{1/2}}\, e^{-\frac{1}{2} (\mathbf{y}-X\boldsymbol{\beta})' \boldsymbol{\Sigma}^{-1}_y (\mathbf{y}-X\boldsymbol{\beta})}$$

where $\boldsymbol{\Sigma}_y = \sigma^2_A Z Z' + \sigma^2 I$.

Maximizing $L$ over $\boldsymbol{\beta},\sigma^2_A, \sigma^2$ gives estimates of the variance components and the fixed effects. The random effect predictions $\hat{\mathbf{u}}$ may also be needed; these are normally obtained using BLUP.

Details

Recall that if $W$ is a random vector with $EW = \mu$ and $VW= \boldsymbol{\Sigma}$, then

$$E[AW] = A\mu$$

$$Var[AW]= A \boldsymbol{\Sigma} A'$$

In particular, if $W \sim N(\mu, \boldsymbol{\Sigma})$ then $AW \sim N(A\mu, A \boldsymbol{\Sigma} A')$.

Now consider the lmm with

$$y = X \boldsymbol{\beta} + Zu + \boldsymbol{\epsilon}$$

where

$$u = (u_1, \ldots, u_m)'$$

$$\boldsymbol{\epsilon} = (\epsilon_1, \ldots, \epsilon_n)'$$

and the random variables $U_i \sim N(0, \sigma^2_A)$, $\epsilon_i \sim N(0, \sigma^2)$ are all independent, so that $u \sim N(0, \sigma^2_A I)$ and $\boldsymbol{\epsilon} \sim N(\mathbf{0}, \sigma^2 I)$.

Then $Ey = X\boldsymbol{\beta}$ and

$$\begin{aligned} Vy &= \boldsymbol{\Sigma}_y \\ &= Var[Zu] + Var[\boldsymbol{\epsilon}] \\ &= Z(\sigma^2_A I) Z' + \sigma^2 I \\ &= \sigma^2_A Z Z' + \sigma^2 I \end{aligned}$$

and hence $y \sim N(X\boldsymbol{\beta},\, \sigma^2_A Z Z' + \sigma^2 I)$.

Therefore the likelihood function for the unknown parameters $L(\boldsymbol{\beta},\sigma^2_A, \sigma^2)$ is

$$L(\boldsymbol{\beta},\sigma^2_A, \sigma^2) = \frac{1}{(2\pi)^{n/2} \left| \boldsymbol{\Sigma}_y \right|^{1/2}}\, e^{-\frac{1}{2} (\mathbf{y}-X\boldsymbol{\beta})' \boldsymbol{\Sigma}^{-1}_y (\mathbf{y}-X\boldsymbol{\beta})}$$

where $\boldsymbol{\Sigma}_y = \sigma^2_A Z Z' + \sigma^2 I$. Maximizing $L$ over $\boldsymbol{\beta},\sigma^2_A, \sigma^2$ gives estimates of the variance components and the fixed effects. The predictions $\hat{u}$ of the random effects may also be needed; these are normally obtained using BLUP.
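
In practice this maximization is carried out numerically. The sketch below uses lme4 (an assumption; the notes do not prescribe software) with REML = FALSE to request maximum likelihood rather than the REML default:

    # ML estimation of the variance components, fixed effects, and BLUPs
    library(lme4)
    set.seed(3)
    J <- 8; n_per <- 6
    j <- factor(rep(1:J, each = n_per))
    u <- rnorm(J, 0, 2)                         # true sigma_A = 2
    y <- 10 + u[j] + rnorm(J * n_per, 0, 1)     # true sigma = 1
    fit_ml <- lmer(y ~ 1 + (1 | j), REML = FALSE)
    print(VarCorr(fit_ml))   # estimated sigma_A and sigma
    fixef(fit_ml)            # estimated fixed effect mu
    ranef(fit_ml)            # BLUPs u-hat of the random effects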